@hginjgerx

This PR is an RFC to support dynamically loading LTTng tracing.

As mentioned in our brief discussion previously in #1587, there are some benefits to supporting this:

  1. Regular driver libraries can still be built with -DENABLE_LTTNG even if LTTng is not installed.
  2. Applications directly relying on driver libraries (e.g., perftest) won't inherit the LTTng dependency.
  3. No additional dependencies or performance penalty will be introduced to regular libraries. Users who do not need tracing can install rdma-core and run applications as usual, while tracing users can simply install LTTng and preload the tracing libraries without rebuilding rdma-core.

We believe that this can greatly improve the usability of LTTng tracing.

BTW, the first patch is included incidentally and isn’t actually part of this feature. It’s meant to fix the static compilation failures when enabling LTTng.

wenglianfa added 4 commits July 2, 2025 20:12
Currently static compilation with LTTng tracing enabled fails with
the following errors:

In file included from /home/rdma-core/providers/rxe/rxe_trace.c:9:
/rdma-core/providers/rxe/rxe_trace.h:12:38: fatal error: rxe_trace.h: No such file or directory
   12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "rxe_trace.h"
      |                                      ^~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/rxe/CMakeFiles/rxe.dir/build.make:76: providers/rxe/CMakeFiles/rxe.dir/rxe_trace.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/efa/efa_trace.c:9:
/home/rdma-core/providers/efa/efa_trace.h:12:38: fatal error: efa_trace.h: No such file or directory
   12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "efa_trace.h"
      |                                      ^~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/efa/CMakeFiles/efa-static.dir/build.make:76: providers/efa/CMakeFiles/efa-static.dir/efa_trace.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:3085: providers/efa/CMakeFiles/efa-static.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/mlx5/mlx5_trace.c:9:
/home/rdma-core/providers/mlx5/mlx5_trace.h:12:38: fatal error: mlx5_trace.h: No such file or directory
   12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "mlx5_trace.h"
      |                                      ^~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/mlx5/CMakeFiles/mlx5-static.dir/build.make:76: providers/mlx5/CMakeFiles/mlx5-static.dir/mlx5_trace.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/hns/hns_roce_u_trace.c:9:
/home/rdma-core/providers/hns/hns_roce_u_trace.h:12:38: fatal error: hns_roce_u_trace.h: No such file or directory
   12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "hns_roce_u_trace.h"
      |                                      ^~~~~~~~~~~~~~~~~~~~
compilation terminated.

Fix it by linking the library and including drivers' directories for
static compilation.

Fixes: 382b359 ("efa: Add support for LTTng tracing")
Signed-off-by: wenglianfa <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Create extra provider libraries for tracing so that the regular
libraries do not need to have a dependency on LTTng. For
example, there will be a new libhns_trace-rdmav*.so for hns
tracing.

Usage example:
$ lttng create my_session
$ lttng enable-event -u rdma_core_hns:post_send
$ lttng start
$ LD_PRELOAD=/usr/lib64/libibverbs/libhns_trace-rdmav*.so ib_send_bw -d hns_0
$ LD_PRELOAD=/usr/lib64/libibverbs/libhns_trace-rdmav*.so ib_send_bw -d hns_0 10.10.10.10
$ lttng stop
$ lttng view

No additional dependencies or performance penalty will be introduced
if users don't load the tracing library explicitly as shown above.

This change involves all providers that support LTTng tracing,
including efa, hns, mlx5 and rxe.

Signed-off-by: wenglianfa <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Define rdma_tracepoint() in the common trace.h to remove duplicate
definition in drivers.

Signed-off-by: wenglianfa <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
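
To illustrate the idea of a single rdma_tracepoint() definition shared by all providers, here is a minimal sketch, not the code from this series. The `LTTNG_ENABLED` guard and the `rdma_core_demo` provider name are assumptions made for the example; with the guard undefined, the macro expands to nothing, so the call site costs nothing and does not evaluate its arguments.

```c
/*
 * Hypothetical sketch of a common trace.h wrapper.
 * LTTNG_ENABLED is an assumed build-time define; rdma_core_demo is a
 * made-up provider name used only for illustration.
 */
#if defined(LTTNG_ENABLED)
#include <lttng/tracepoint.h>
/* Forward to LTTng-UST. A real build also needs the matching
 * tracepoint provider definition, which is omitted here. */
#define rdma_tracepoint(provider, name, ...) \
	lttng_ust_tracepoint(provider, name, ##__VA_ARGS__)
#else
/* Tracing disabled: the call site compiles away entirely. */
#define rdma_tracepoint(provider, name, ...) do { } while (0)
#endif

/* Example call site: drivers use the shared macro instead of
 * defining their own copy. */
static int post_send_sketch(int qp_num, int wr_count)
{
	rdma_tracepoint(rdma_core_demo, post_send, qp_num, wr_count);
	return wr_count;
}
```

Keeping the fallback in one header means a driver only needs to include it and call rdma_tracepoint(), whether or not tracing was compiled in.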
Now that the tracing libraries have been separated from the regular
provider libraries, enabling LTTng tracing by default has become
feasible for release builds of rdma-core. Users can choose whether to
install the tracing libraries according to their needs, which improves
usability.

Signed-off-by: wenglianfa <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
@rleon
Member

rleon commented Jul 7, 2025

And I still think that providing extra, special tracing libraries as part of rdma-core is the wrong approach.

@natoscott

> And I still think that providing extra, special tracing libraries as part of rdma-core is the wrong approach.

I came here looking for tracing in rdma-core too, after recent involvement in debugging a RoCE setup. One issue we encountered was the way a number of the APIs pass back a NULL and no other error context on failure (errno from failed ioctls would have been desirable), making it difficult to triage many classes of failure.

Instead of LTTng, would it be possible to embed support for USDT style traces?

The eBPF folk have recently provided a header-only solution - https://github.com/libbpf/usdt/blob/main/usdt.h - which I've had good success with in two other projects. It requires no additional runtime or build dependencies, as they encourage embedding that one header file directly into your project. The USDT implementation involves insertion of noop instructions at instrumentation points, and some additional annotations in the library (on-disk), so it is essentially zero overhead when the traces are not in use.

In my case earlier, traces including the errno on API failure paths where NULL-return-indicates-failure would be very useful to have readily available.
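
As a rough illustration of the failure-path tracing described above, here is a sketch under stated assumptions. `demo_open`, `struct demo_ctx`, the `TRACE_ERR` wrapper, the `rdma_core_demo` group and the `DEMO_HAVE_USDT_H` switch are all hypothetical names for this example, not rdma-core APIs. When the switch is defined, the wrapper forwards to the `USDT(group, name, args...)` macro that, as far as I understand it, the libbpf/usdt header provides; otherwise it compiles to nothing.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * DEMO_HAVE_USDT_H is a hypothetical opt-in define meaning that usdt.h
 * from libbpf/usdt has been embedded in the project.
 */
#ifdef DEMO_HAVE_USDT_H
#include "usdt.h"
#define TRACE_ERR(name, err) USDT(rdma_core_demo, name, err)
#else
#define TRACE_ERR(name, err) do { (void)(err); } while (0)
#endif

/* Made-up context type standing in for a verbs object. */
struct demo_ctx {
	int fd;
};

/* Returns NULL on failure, like many verbs calls, but also sets errno
 * and reports it through a trace point on every failure path. */
static struct demo_ctx *demo_open(int fd)
{
	struct demo_ctx *ctx;

	if (fd < 0) {
		errno = EBADF;
		TRACE_ERR(open_failed, errno);
		return NULL;
	}

	ctx = calloc(1, sizeof(*ctx));
	if (!ctx) {
		TRACE_ERR(open_failed, errno);
		return NULL;
	}

	ctx->fd = fd;
	return ctx;
}
```

With probes like this in place, a tool such as bpftrace could attach to the failure probe and print the errno value without the application being modified.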

@rleon
Member

rleon commented Nov 10, 2025

> > And I still think that providing extra, special tracing libraries as part of rdma-core is the wrong approach.
>
> I came here looking for tracing in rdma-core too, after recent involvement in debugging a RoCE setup. One issue we encountered was the way a number of the APIs pass back a NULL and no other error context on failure (errno from failed ioctls would have been desirable), making it difficult to triage many classes of failure.
>
> Instead of LTTng, would it be possible to embed support for USDT style traces?
>
> The eBPF folk have recently provided a header-only solution - https://github.com/libbpf/usdt/blob/main/usdt.h - which I've had good success with in two other projects. It requires no additional runtime or build dependencies, as they encourage embedding that one header file directly into your project. The USDT implementation involves insertion of noop instructions at instrumentation points, and some additional annotations in the library (on-disk), so it is essentially zero overhead when the traces are not in use.
>
> In my case earlier, traces including the errno on API failure paths where NULL-return-indicates-failure would be very useful to have readily available.

We can support both; patches are welcome.

natoscott added a commit to natoscott/rdma-core that referenced this pull request Nov 13, 2025
Default to providing lightweight USDT trace points when LTTng
is unavailable.  This piggybacks on the existing tracing code
added for LTTng for a minimal set of changes.

> $ sudo bpftrace -l usdt:build/lib/lib*.so:*
> usdt:build/lib/libefa-rdmav59.so:rdma_core_efa:post_recv
> usdt:build/lib/libefa-rdmav59.so:rdma_core_efa:post_send
> usdt:build/lib/libefa-rdmav59.so:rdma_core_efa:process_completion
> usdt:build/lib/libefa.so:rdma_core_efa:post_recv
> usdt:build/lib/libefa.so:rdma_core_efa:post_send
> usdt:build/lib/libefa.so:rdma_core_efa:process_completion
> usdt:build/lib/libhns-rdmav59.so:rdma_core_hns:poll_cq
> usdt:build/lib/libhns-rdmav59.so:rdma_core_hns:post_recv
> usdt:build/lib/libhns-rdmav59.so:rdma_core_hns:post_send
> usdt:build/lib/libhns.so:rdma_core_hns:poll_cq
> usdt:build/lib/libhns.so:rdma_core_hns:post_recv
> usdt:build/lib/libhns.so:rdma_core_hns:post_send
> usdt:build/lib/libmlx5-rdmav59.so:rdma_core_mlx5:post_send
> usdt:build/lib/libmlx5.so:rdma_core_mlx5:post_send
> usdt:build/lib/librxe-rdmav59.so:rdma_core_rxe:post_send

The USDT header used here is from the libbpf/usdt project at
https://github.com/libbpf/usdt.git

Further background discussion for this commit is included in
linux-rdma#1621

Signed-off-by: Nathan Scott <[email protected]>